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1.  Summary 

In  this  final  report  we  summarize  our  accomplishments  in  the  research  program 
supported  by  Grant  AFOSR-92-J-0002  over  the  period  from  October  1,  1991  to 
September  30,  1994.  The  basic  scope  of  this  research  program  was  to  carry  out 
fundamental  research  in  the  analysis,  estimation,  processing,  and  control  of  complex 
signals  and  systems  with  particular  emphasis  on  (a)  the  development  of  theories  and 
processing  methodologies  for  signals  possessing  multiple  resolution  descriptions  and  for 
systems  with  multiple  time  scales  using  concepts  related  to  wavelet  transforms  and 
pyramidal  processing  structures  for  the  former  and  singular  perturbation  methods  for 
discrete-state  systems  for  the  latter;  and  (b)  the  analysis  of  complex,  discrete-event 
systems,  providing  a  more  qualitative  control-theoretic  counterpait  to  the  analytic 
perturbation  methods  under  (a).  The  principal  investigator  for  this  effort  is  Professor 
Alan  S.  Willsky,  and  Dr.  W.  C.  Kaii  is  co-principal  investigator.  Professor  Willsky  and 
Dr.  Karl  have  been  assisted  by  several  graduate  research  assistants  as  well  as  additional 
thesis  students  not  requiring  stipend  or  tuition  support  from  this  grant. 

We  feel  that  our  work  under  this  grant  has  been  highly  successful  not  only  in 
producing  a  significant  number  of  results  and  publications  but  also  in  both  uncovering 
new  reseai'ch  questions  and  directions  for  future  investigations  and  in  providing  advanced 
technology  of  direct  relevance  to  Air  Force  missions  and  programs.  Indeed  discussions 
with  Air  Force  personnel  and  with  other  organizatios  working  on  Air  Force  projects  have 
provided  significant  motivation  and  direction  for  the  research  we  have  carried  out  and 
that  we  plan  for  the  future.  At  the  end  of  this  report  we  have  included  a  list  of  the 
publications  supported  by  Grant  F49620-92-J-002.  In  total  the  publications  resulting 
from  research  supported  by  Grant  F49620-92-J-002  include  23  papers  that  have  appeared 
or  been  submitted  to  journals,  9  journal  papers  presently  in  preparation,  25  papers 
presented  at  conferences,  2  book  chapters,  3  S.M.  theses,  and  5  Ph.D.  theses.  We 
anticipate  that  several  additional  papers  will  also  be  written. 

In  addition,  our  work  has  received  considerable  national  and  international 
attention  and  recognition.  In  particular: 


Professor  Willsky  was  one  of  the  Guest  Editors  (along  with  Prof.  Stephane  Mallat 
and  Dr.  Ingrid  Daubechies)  of  the  IEEE  Transactions  on  Information  Theory 
special  issue  on  wavelet  transforms  and  multiresolution  signal  analysis, 

March  1992. 

Professor  Willsky  was  invited  to  give  the  keynote  address  at  the  inaugural 
workshop  for  the  Centi'e  for  Robust  and  Adaptive  Systems,  Canberra,  Australia, 
Februai-y,  1992.  The  official  center  opening  was  performed  by  The  Hon.  Ross 


Free,  M.P.  and  Minister  of  Science  and  Technology.  Prof.  Willsky  gave  lectures 
on  both  of  the  principal  components  of  the  research  supported  by  this  grant, 
namely  multiresolution  signal  and  image  processing  and  modeling  and  detection 
of  abrupt  changes  in  systems  and  signals  and  related  problems  of  discrete  event 
systems.  In  addition.  Prof.  Willsky  gave  a  speech  on  the  role  of  national  research 
centers  and  on  the  issues  involved  in  technology  transfer  and  partnerships 
between  universities,  industry,  and  government. 

Prof.  Willsky  was  asked  to  give  the  principal  lecture  on  "Multiresolution  Methods 
in  Statistical  Image  Analysis,"  at  the  Tri-Service  Workshop  on  Statistical  Methods 
in  Image  Processing,  Harry  Diamond  Lab,  Maryland,  May  1992. 

Dr.  Mark  Luettgen  a  Ph.D.  student  with  Prof.  Willsky  presented  an  invited  talk  on 
"Multiscale  Signal  Processing"  at  the  American  Mathematics  Society  Conference 
on  Wavelets  and  Applications,  S.  Hadley,  MA,  June  1992. 

Prof.  Willsky  delivered  a  plenary  addi'ess  at  the  SIAM  Conference  on  Control  and 
Its  Applications,  Minneapolis,  September  1992.  The  subject  of  this  address  was 
systems  and  control  challenges  in  image  analysis,  with  particular  emphasis  on 
multiresolution  models  and  methods. 

Prof.  Willsky  gave  one  of  the  principal  plenary  lectures  at  the  INRIA  Qnstitut 
National  de  Recherche  en  Informatique  et  en  Automatique)  Twenty-Fifth 
Anniversary  Symposium,  held  at  the  French  Ministry  of  Research,  Paris,  France, 
December  1992. 

Dr.  Karl  was  asked  to  organize  an  invited  session  on  geometric  and  stochastic 
methods  in  image  analysis  and  reconsuajction. 

Prof.  Willsky  gave  the  opening  keynote  address  at  the  IEEE  Image  and 
Multidimensional  Signal  Processing  Workshop,  Cannes,  France,  Sept.  1993. 

Dr.  Karl  has  been  asked  to  serve  on  the  organizing  committee  for  the  upcoming 
Workshop  on  Wavelets  in  Medical  Image  Processing  to  be  held  at  Johns  Hopkins. 

Prof.  Willsky  was  asked  to  give  a  plenary  lecture  on  estimation  and  statistical 
inference  problems  in  image  analysis  at  the  American  Statistical  Association 
Meeting  in  Toronto  in  August  1994. 


In  addition,  our  work  has  also  been  dtiectly  recognized  by  the  Air  Force  as 
evidenced  by  the  following: 

Our  work  was  cited  in  the  1991  AFOSR  Research  Highlights  publication. 

Prof.  Willsky  was  one  of  the  paiticipants  in  the  March  1992  AFOSR-sponsored 
Workshop  on  Applications  of  Wavelet  Transforms  in  Signal  Processing,  held  at 
The  Air  Force  Institute  of  Technology,  Wright-Patterson  Air  Force  Base,  Ohio. 

Prof.  Willsky  delivered  an  invited  talk  on  Multii'esolution  Signal  Analysis  in 
Rome,  New  York,  sponsored  both  by  the  local  chapter  of  the  IEEE  and 
by  personnel  at  Rome  Laboratory  (Mr.  John  Graneiro  and  Mr.  Vincent 
Vannicola). 


Mr  Eric  Miller,  one  of  Prof.  Willsky's  Ph.D.  students  working  on  multiresolution 
methods,  was  awai'ded  an  Air  Force  Doctoral  Scholarship,  championed  by  the 
signal  processing  and  surveillance  professionals  at  Rome  Laboratories. 


Furthermore,  there  has  been  considerable  interaction  with  Air  Force  personnel,  a  number 
of  examples  of  technology  transfer,  and  a  variety  of  research  problems  Aat  have  ^en 
inspired  by  discussions  with  Air  Force  personnel  or  that  have  been  motivated  by  Air 
Force  missions  and  needs: 

Our  work  on  multiresolution  image  processing  has  formed  the  basis  for  several 
projects  in  surveillance  applications.  In  particular,  an  AFOSR-sposored  Phase  II 
SBIR  project  on  multiresolution  image  fusion  (being  performed  by  Alphatech, 

Inc.)  is  a  direct  outgrowth  of  our  MIT  research.  In  addition,  several  additional 
SBIR  projects  involving  the  transition  of  our  research  are  currently  being  pursued 
with  Rome  Laboratory  (in  remote  sensing  and  surveillance). 

In  Mai-ch  1992,  Prof.  Willsky  met  with  Drs.  S.  Banda  and  P.  Chandler  of  Wright 
Laboratories  to  discuss  problems  in  robust  failure  detection.  Prof.  Willsky  s 
previous  work  in  failure  detection  has  already  found  its  way  into  flight-tested 
systems,  and  he  is  recognized  as  one  of  the  leading  authorities  in  this  area.  The 
limitations  of  existing  methods  has  led  us  again  to  formulate  new  research 
problems  (see  Section  V). 

Prof.  Willsky  and  Dr.  W.C.  Karl  have  been  engaged  in  extensive  discussions  and 
collaboration  with  engineers  at  Lincoln  Laboratory  concerning  the  incorporation 
of  advanced  multiresolution  methods  into  integrated  automatic  target  recognition 
systems.  Lincoln  Lab  serves  as  the  Center  of  Excellence  for  this  ARPA’s 
program  in  this  area,  and  Prof.  Willsky  has  established  a  continuing  relationship 
with  Lincoln.  As  part  of  this  interaction,  we  have  recently  demonstrated  that  our 
multiresolution  image  analysis  methods  can  significantly  enhance  performance  of 
existing  ATR  algorithms  such  as  the  one  that  has  been  developed  over  a  number 
of  years  at  Lincoln. 

In  addition.  Prof.  Willsky  has  established  contact  with  Air  Force  efforts  in  ATR, 
most  notably  at  Wright  Laboratory  with  Mr.  E.  Zelnio  and  his  group.  In 
particular,  in  June  1994,  Prof.  Willsky  visited  Mr.  Zelnio's  group  together  with 
Dr.  Charles  Holland  and  Dr.  Abe  Waksman  in  order  to  engage  in  extensive 
technical  disucssions  and  to  review  ongoing  research  activities  at  Wright  Lab. 

Prof.  Willsky  paiticipated  in  discussions  with  Drs.  C.  Holland  ^d  N.  Classman  of 
AFOSR  in  order  to  assist  them  in  defining  a  new  initiative  in  discrete  event 
systems  (in  which  they  were  ultimately  successful). 

Prof.  Willsky  and  Dr.  Karl  have  also  recently  initiated  discussions  with 
researchers  at  Phillips  Laboratoiy  (in  particular,  we  have  recently  met  with  Dr.  S. 
Cusumano)  dealing  with  issues  related  to  optical  tracking  for  the  High-Energy 
Laser  program. 

Prof.  Willsky,  Dr.  Karl,  and  graduate  student  Mr.  M.  Bhatia  have  recently 
initiated  interactions  with  other  MIT  researchers  (led  by  Dr.  R.  Lanza  of  MIT  s 
Laboratory  for  Nuclear  Science)  involved  in  advanced  methods  of  nondestructive 
evaluation  for  applications  such  as  con'osion  detection  in  aging  aircraft. 


IL  Research  Progress  Description 

In  this  section  we  briefly  describe  the  research  accomplishments  we  have 
achieved  over  the  three-year  period  of  our  grant.  We  limit  ourselves  here  to  a  succinct 
summary  of  these  results  and  refer  to  the  publications  listed  at  the  end  of  this  report  for 
detailed  developments.  Our  research  can  roughly  be  divided  several  parts  which  we 
describe  in  the  following  paragraphs. 

Multiresolution  approaches  to  mage  analysis 

The  research  described  in  this  section  is  reported  in  a  number  of  papers  and 
reports  [16-32,  34,  41-44,49-50,54,62].  The  basic  idea  behind  our  work  is  the  use  of 
scale-recursive  stochastic  models  to  describe  signals,  images,  and  phenomena  in  1,2,  or 
higher  dimension  (i.e.,  where  not  only  the  process  values  but  also  the  independent 
variable  may  be  multidimensionsal— e.g.,  to  describe  spatially-distributed  phenomena). 
More  precisely,  the  models  that  we  have  developed  evolve  on  pyramidal  structures, 
namely  trees,  where  each  level  in  the  tree  corresponds  to  a  particular  resolution  of 
representation  and  where  the  process  is  described  by  scale-to-scale  dynamics.  For  the 
most  part  the  dynamics  that  we  have  used  are  linear,  although,  as  we  discuss  shortly,  the 
framework  can  readily  accommodate  nonlinear  and/or  discrete-valued  dynamics  and 
variables.  More  precisely,  the  dynamics  we  have  considered  are  of  the  form 


x(t)  =  A(t)  x(tY)  -r  B(t)  w(t) 

(1) 

y(t)  =  C(t)  x(t)  -H  v(t) 

(2) 

where  t  is  the  index  of  nodes  on  a  multiresolution  U’ee  over  which  our  model  is  defined. 

In  particular-,  t  should  be  thought  of  as  indexing  a  pair  (scale,  location)  where  "scale" 
indexes  the  level  on  the  ti'ee  (i.e.,  the  resolution)  con-esponding  to  node  t  and  "location" 
indexes  the  spatial  or  temporal  location  to  which  this  node  corrsponds.  The  node  t  is 
connected  to  a  number  of  child  nodes  at  the  next  finer  scale  representing  finer  scale 
descriptions  of  the  phenomenon  and  a  finer  scale  partitioning  of  the  region  corresponding 
to  node  t.  While  the  number  of  children  is  arbitrary  in  general,  the  standard  cases  usually 
considered  are  that  of  a  dyadic  tree  (where  each  node  has  2  children)  for  1-D  processes 
and  a  quad-tr-ee  (4  childr-en)  for  2-D  processes  (although  it  is  certainly  possible  to 
represent  either  1-D  or  2-D  processes  with  U'ees  with  any  number  of  children  and  in  fact 
with  space  and  scale-vai'ying  numbers  of  children) 


In  general  the  process  x(t)  is  a  vector  process  capturing  aU  of  the  relevant 
information  about  the  phenomenon  of  interest  at  that  particular  scale  and  location  so  that 
given  x(t),  finer-resolution  detail  at  this  location  is  independent  of  the  behavior  of  the 
process  outside  this  local  region.  Also,  we  use  the  notation  ty  to  denote  the  parent  of 
node  t,  namely  the  next  coarser-scale  node  covering  the  region  corresponding  to  node  t. 
Thus  what  (1)  states  is  that  the  finer- scale  representation  x(t)  of  our  phenomenon  at  node 
t  is  obtained  by  interpolating  the  coarser-scale  description  xfty)  at  parent  node  ty  and  then 
adding  some  independent  finer  scale  detail,  captured  by  the  term  B(t)w(t).  Equation  (2) 
then  models  the  available  data,  which  may  be  available  at  multiple  scales  of  resolution. 
An  important  fact  here  is  that  the  model  parameters  (A,  B,  C,  and  the  noise  intensities) 
may  vary  from  node  to  node.  This  allows  us  to  capture  a  number  of  important  features, 
including  scale-varying  statistics  (which,  for  example,  allows  us  to  capture  scaling  laws 
used  to  describe  fractal  processes  such  as  those  with  1/f  -  like  spectra)  and  the  availability 
of  irregularly  sampled  data  in  both  space  and  resolution  (allowing  us  to  capture 
frequently  encountered  situations  in  sensor  fusion,  remote  sensing,  and  other  contexts  in 
which  complete  coverage  data  may  not  be  available  at  any  scale  but  there  may  be  an 
irregular  pattern  of  overlaps  of  coverage  of  sensors  with  different  resolutions) 

In  our  work  we  have  been  able  to  demonstrate  the  considerable  power  of  this 
modeling  framework.  In  particular,  some  of  the  important  results  of  our  work  are  the 
following: 

1.  The  simplest  example  of  a  model  of  the  form  of  (1),  (2)  is  the  so-called  Haar 
model,  in  which  x(t)  simply  represents  the  average  value  of  the  phenomenon  of  interest 
over  an  intei-val  of  length  2'^  (for  1-D  processes  using  a  dyadic  tree  and  where  the  scale 
of  t  is  denoted  by  m)  or  over  a  square  with  side  of  length  2'^^  (for  2-D  processes  using  a 
quad- tree).  However,  the  framework  desci'ibed  by  (1),  (2)  is  in  fact  far  richer  than  this 
simple  example.  In  pai'ticular,  as  described  in  [26],  any  Markov  process  in  1-D  (and  in 
fact  any  process  in  the  somewhat  larger  class  of  reciprocal  processes  or  l-D  Markov 
random  fields)  can  be  exactly  represented  as  in  (1),  (2)  using  a  generdization  of  the  so- 
called  midpoint  deflection  construction  of  Brownian  motion.  In  particular,  for  any  such 
process,  given  the  value  at  one  point,  the  values  to  its  left  and  right  are  independent,  and 
this  fact  then  leads  dkectly  to  a  method  for  constructing  the  process  that  is  exactly  of  the 
form  of  (1),  (2).  Moreover,  it  is  also  shown  in  [26]  that  in  principle  any  Markov  random 
field  (MRF)  in  2-D  may  also  be  exactly  modeled  using  a  multiresolution  model  as  in  (1), 
(2).  The  idea  here  is  a  generalization  of  that  used  in  1-D.  Specifically,  if  we  wish  to 
represent  an  MRF  over  a  spatial  region,  then,  given  the  values  of  the  field  over  any  line 
separating  the  region  into  two  disjoint  regions,  then  the  values  of  the  field  in  these  two 
regions  are  independent.  This  again  leads  to  a  method  for  constructing  the  MRF  exactly 
as  in  (1),  (2).  The  problem,  of  course,  in  2-D  is  that  the  memory  needed  to  decorrelate 
the  behavior  of  the  field  over  the  two  regions  is  an  entire  line  of  values  of  the  field  (and 
not  simply  the  value  of  the  process  at  a  point),  so  that  the  "state"  dimension  of  x(t) 
depends  on  the  size  of  the  reigion  of  interest-t/iat  is  if  we  truly  wish  to  model  the  MRF 
exactly.  In  particulai',  since  MRF  models  ai'e  idealizations  themselves,  we  were  led  to 


the  idea  of  using  an  approximate  multiscale  representation  for  such  a  field,  where  at  each 
scale  we  assume  that  the  two  disjoint  subregions  of  the  region  corresponding  to  node  t  are 
independent  when  we  are  given  a  coarse  approximation  of  the  values  of  the  field  along 
the  boundary  between  the  two  subregions.  Specifically,  if  we  imagine  taking  the  wavelet 
transform  of  the  values  of  the  field  along  the  1-D  boundary  between  the  two  regions,  then 
our  approximate  models  consist  of  assuming  that  the  two  regions  ^e  independent  given ^ 
only  a  certain  knowledge  of  only  some  of  that  transform.  By  varying  how  much  some 
is  we  can  trade  off  model  complexity  (in  terms  of  tire  dimensionality  of  the  wavelet 
coefficients  that  must  be  captured  in  x(t))  with  accuracy  in  approximating  the  MRF 
(where  we  have  an  exact  representaion  if  all  of  the  wavelet  coefficients  are  kept).  Note 
that  as  we  move  to  finer  scales,  the  size  of  the  regions  being  considered  (which  are 
continually  subdivided  as  we  move  from  scale  to  scale)  become  smaller,  so  that  the 
"coarse"  representation  that  we  are  keeping  in  fact  becomes  rather  fine  (and  eventually  is 
exact).  While  a  rational  and  mathematically  precise  methodology  for  determining  when 
this  method  is  useful  (in  that  acceptably  accurate  models  are  obtained  with  acceptably 
low  dimensionality)  is  yet  to  be  developed,  we  have  demonstrated  [26]  that  this  approach 
does  yield  surprisingly  good  models  of  very  low  dimensionality  for  a  variety  of  MRP  s. 
Since  MRP's  ai'e  widely  accepted  as  a  very  rich  modeling  frannework,  what  we  have 
demonstrated  is  that  our  multiresolution  modeling  framework  is  at  least  as  rich— and 
indeed  richer,  since  it  provides  a  much  more  direct  way  in  which  to  capture  correlation 
structures  at  different  scales  than  can  be  captured  using  MRP's.  Moreover,  as  the  next 
point  makes  clear,  these  models  have  additional  and  very  significant  advantages. 

2.  One  of  the  gi'eat  strengths  of  the  modeling  framework  provided  by  (1),  (2)  is 
that  it  leads  to  extremely  powerful  and  efficient  optimal  processing  algorithms.  In 
particular,  as  described  in  our  papers  (see  in  particular  [25]),  the  optimal  estimation 
problem  for  (1),  (2)-i.e.,  the  construction  of  the  optimal  estimate  of  x(t)  at  each  node 
(and  thus  at  every  scale  and  location)  based  on  all  of  the  data  given  in  (2)  has  a  very 
efficient  solution,  which  involves  the  scale-recursive  generalization  of  the  powerful  time- 
recursive  estimation  methods  of  systems  and  conti'ol,  namely  Kalman  filtering  and 
Rauch-Tung-Striebel  smoothing.  In  particular,  the  optimal  estimator  consists  of  a  fine-to- 
coarse  Kalman  filtering  sweep  in  which  at  each  node  we  compute  the  best  estimate  at  that 
node  given  all  of  the  data  available  at  nodes  in  the  subU’ee  beneath  that  node,  followed  by 
a  coarse-to-fine  smoothing  sweep,  again  following  the  topology  of  the  tree.  The  end 
result  is  an  algorithm  that  produces  optimal  estimates  and  corresponding  en’or 
covariances  at  each  scale  and  location  with  computational  complexity  per  data  point  at 
the  fine  scale  that  is  constant  independent  of  the  size  of  the  data  region  being  considered. ^ 
While  this  is  not  surprising  for  1-D  signals  (where  the  usual  Kalman  filter  does  this),  it 
represents  a  major  breakthrough  for  2-D  processes.  In  particular,  solving  optimal 
estimation  problems  for  MRF  models  roughly  coiTesponds  to  solving  elliptic  partial 
differential  equations,  which  generally  have  complexity  that  grows  with  the  size  of  the 
domain  being  considered.  Moreover,  when  such  models  are  used,  one  essentially  never 
computes  eiTor  covai'iances,  since  the  complexity  of  that  calculation  dwarfs  that  of  the 
calculation  of  the  estimates.  With  our  models,  we  not  only  get  the  estimates  and  the  error 
covariances  with  constant  computational  complexity  per  image  point,  but  we  also  get 
both  of  these  at  multiple  scales  of  resolution  and,  as  shown  in  [54]  we  actually  get  a 
complete  multiscale  model  for  the  estimation  errors  so  that  we  in  essence  have  a  complete 
description  of  the  coirelation  between  estimation  eiTors  at  any  locations  and  scales. 

These  features  are  of  profound  importance  for  many  of  the  applications  of  practical 


iThis  follows  since  (a)  each  node  in  the  tree  is  visited  twice-once  on  the  fine-to-coarse  sweep  and  once  on 
the  coarse-to-fine  sweep--and  the  total  number  of  nodes  on  the  tree  is  proportional  to  the  number  of  nodes 
at  the  finest  level  (roughly  twice  as  many  for  the  dyadic  tree  and  4/3  as  many  for  the  quad-tree. 


interest  in  which  either  real-time  image  processing  is  required  or  the  extent  of  the  spatial 
region  of  interest  is  enormous. 

3.  We  have  successfully  applied  this  estimation  methodology  to  the  problem  of 
optical  flow  estimation  in  image  sequences,  as  described  in  [25].  In  particular,  in  the 
field  of  computer  vision  a  "smoothness  penalty"  is  generally  used  to  regularize  a  variety 
of  problems  in  image  analysis  including  the  problem  of  estimating  the  motion  from  one 
frame  in  an  image  sequence  to  the  next.  By  interpreting  this  penalty  as  a  "fractal  prior" 
we  were  able  to  replace  it  with  an  analogous  multiscale  fractal  model,  allowing  us  to  use 
our  extremely  fast  optimal  estimation  algorithms  not  only  to  produce  the  optimal 
estimates  but  also  to  produce  these  at  multiple  resolutions  and  to  produce  the 
corresponding  multiresolution  eiTor  covariances  with  an  algorithm  that  was  shown  to  be 
anywhere  from  1  to  2  orders  of  magnitude  faster  than  previously  known  algorithms  for 
512  X  512  images.  In  addition,  as  shown  in  [25],  tliese  multiscale  error  statistics  can  be 
used  to  identify  tlte  optimal  scale  of  resolution  for  the  optimal  estimates  at  each  point  in 
the  image. 

4.  As  described  in  [44],  we  have  also  shown  that  the  model  (1),  (2)  admits  an 
extremely  efficient  whitening  algorithm  which  can  be  used  for  the  efficient  calculation  of 
likelihood  functions  for  these  models.  What  is  interesting  mathematically  about  this 
algorithm  is  the  novel  twist  that  is  required  as  compared  to  the  whitening  approach  for 
standard  temporal  state  space  models.  In  particular,  for  temporal  models,  the  Kalman 
filter  directly  produces  a  whitened  version  of  the  measurements  in  the  form  of  the 
innovations  process,  the  fine-to-coarse  scale-recursive  Kalman  filter  only  whitens  the 
measurement  at  any  panicular  node  with  respect  to  the  data  in  the  subti'ee  below  that 
node.  Consequently,  there  is  still  die  task  of  whitening  each  measurement  with  respect  to 
the  data  outside  its  con'esponding  descendent  subtree.  As  shown  in  [44],  it  is  possible  to 
do  this  in  a  conceptually  pleasing  and  algorithmically  effective  manner,  with  calculations 
that  can  also  be  organized  as  a  coai‘se-to-fine  step  (although  different  from  the  one  used 
in  computing  the  smoothed  estimates).  The  result  is  again  an  algorithrn  with  constant 
computational  complexity  per  node,  for  any  set  of  measurement  data— i.e.,  this  algorithm 
performs  whitening  and  likelihood  calculation  for  problems  in  which  we  have  irregularly 
sampled  measurements  at  one  or  more  resolutions.  This  is  again  in  stark  contrast  to  other 
models  for  random  fields,  such  as  MRF's  for  which  the  calculation  of  likelihoods 
(including  the  so-called  paitition  function)  is  extremely  complex  computationally, 
especially  if  the  model  or  data  have  spatially-varying  chai'acteristics  (e.g.,  if  the  data  are 
irregularly  sampled,  have  gaps,  or  if  the  parameters  of  tlie  model  are  spatially  varying,  as 
they  are  in  most  image  processing  applications  such  as  the  optical  flow  problem 
considered  in  [25]. 

5.  We  have  demonstrated  the  potential  of  our  multiresolution  likelihood  function 
calculation  algorithm  by  using  it,  together  with  our  approximate  models  of  MRF's,  for 
texture  discrimination.  In  particular,  using  our  computationally  efficient  approach  we 
achieve  nearly  the  same  performance  for  distinguishing  MRF  textures,  as  the 
prohibitively  complex  optimal  algorithm  (using  the  exact  likelihood  calculation  for  the 
MRF  model)  and  substantially  outperfoim  well-known  suboptimal  methods  that  have 
been  developed  in  order  to  overcome  this  computational  obstacle.  Since,  as  we  have 
stated,  MRF's  ai-e  an  idealization  to  begin  with,  these  results  argue  once  again  for  the 
attractiveness  of  our  modeling  framework,  as  it  not  only  is  very  rich  in  terms  of  the 
random  fields  that  it  can  faithfully  represent  but  also  leads  to  significantly  superior 
algorithms. 

6.  In  addition,  as  mentioned  in  the  previousl  section,  we  have  recently 
demonstrated  the  potential  of  our  multiresolution  modeling  and  discrimination  techniques 


in  the  context  of  Automatic  Tai'get  Recognition.  In  particular,  in  this  work  we  formed 
multiresolution  SAR  images,  used  our  methodology  to  construct  and  validate  models  for 
clutter  and  man-made  objects,  and  then  used  these  models  as  the  basis  for  likelihood 
calculations  in  order  to  distinguish  natural  clutter  from  objects  of  interest.  The  results  by 
themselves  were  exti’emely  encouraging,  and,  when  combined  with  the  feature-based 
ATR  system  developed  by  Lincoln,  our  method  enhanced  overall  performance  (e.g.  as 
measured  by  false  alarm  rate)  considerably. 

Analysis  and  estimation  of  singular  systems  with  applications  in  efficient  processing 
of  multidimensional  data 

Our  work  in  this  area,  which  is  described  in  [10-16,  35-36,  39-41,  49]  has  grown 
out  of  our  study  of  singulai-  or  descriptor  systems  and,  in  particular  the  rich  class  of 
noncausal  models  that  can  be  represented  by  so-called  boundai'y-value  descriptor 
systems.  Our  early  work  in  this  area  established  a  system  theory  for  such  systems,  while 
our  recent  work  has  built  on  this  in  order  to  develop  efficient  and  highly  parallelizable 
processing  algorithms  for  spatially-distributed  phenomena  (such  as  imagery).  In 
particular  we  have  developed  a  general  theory  for  the  estimation  for  singular  systems-- 
which  is  described  in  its  most  complete  and  general  form  in  the  recently  completed  paper 
[12]-and  have  also  developed  a  number  of  new  parallel  algorithms  for  the  processing  of 
one-dimensional  (i.e.,  time  series)  data  and  for  2-Ddata  sets  [11,13-15,  49]. 

There  are  two  particularly  key  ideas  that  we  have  identified  and  exploited  in  our 
work  in  this  direction.  The  first  is  that  in  dealing  with  implicitly  specified  systems-e.g. 
partial  difference  equations  with  boundary  conditions-the  distinction  between 
measurements  and  dynamic  equations  becomes  indistinct.  In  particular,  both  of  these  can 
be  viewed  as  particular  types  of  noisy  constraints  or  relations  on  the  variables  that  we 
wish  to  estimate,  and  the  key  to  efficient  estimation  algorithm  development  is  the 
identification  of  the  structure  of  this  set  of  constraints  (i.e.,  how  they  couple  together  in 
providing  joint  constraints  on  overlapping  sets  of  variables)  and  exploiting  this  structure 
by  incorporating  consti'aints  recursively  in  as  computationally  efficient  a  way  as 
possitble.  In  particular,  in  [12]  we  have  shown  that  the  relationships  among  noisy 
constraints  and  the  sets  of  variables  that  they  constrain  can  always  be  organized  into  a 
tree  structure,  which  in  turn  directly  leads  to  a  recursive  estimation  structure  as  variables 
and  constraints  ai'e  merged  as  we  progress  through  the  tree.  This  concept  turns  out  to 
unify  many  appai'ently  dispai’ate  approaches  to  solving  estimation  and  least  squares 
problems,  ranging  from  standai'd  Kalman  filtering  to  so-called  nested  dissection  methods 
to  the  radial/angular  recursions  that  we  have  developed  in  our  work  in  [15,  49] 


Specifically,  the  second  idea  we  have  exploited  in  [15,  49]  is  that  one  of  the 
problems  in  multidimensional  estimation  and  processing  is  that  instead  of  dealing  with 
initial  conditions,  as  in  recursive  filtering  for  time  series,  we  must  deal  with  boundary 
conditions  in  higher  dimensions.  An  implication  of  this  observation  is  that  algorithm 
complexity  in  higher  dimensions  depends  not  only  on  model  or  filter  order  but  also  on  the 
dimensions  of  the  data  field  to  be  processed.  This  has  led  us  to  the  idea  of  partitioning 
data  into  manageable- sized  subregions,  having  one  processor  assigned  to  the  processing 
of  data  within  each  subregion,  and  then  combining  the  results  via  interprocessor 
communication.  In  addition,  we  have  developed  a  novel  notion  of  recursion  for  the 
processing  within  each  subregion,  namely  that  of  radial  recursion.  Furthermore,  by 
exploiting  the  estimation-theoretic  inteipretations  of  the  sequential  steps  in  this  recursion 
we  have  been  able  to  identify  methods  for  approximating  the  exact  radial  recursions  that 
yield  near-optimal  results  for  many  estimation  problems  with  dramatically  reduced 
computational  loads.  In  pai  ticulai-,  each  step  in  the  radial  recursion  can  be  viewed  as  an 
optimal  estimation  problem  for  a  1-D  process  consisiting  of  the  values  of  the  2-D  process 
along  die  curve  of  points  coitesponding  to  die  cun’ent  radius  in  the  radial  recursion.  Thus 
by  modeling  this  as  a  1-D  estimation  problem  in  angle  around  this  radial  shell,  we  can  t 
use  1-D  recursive  estimation  methods  to  develop  efficient  and  near-optimal  angle- 
recursive  implementations  for  each  step  in  our  radial  recursion.  In  addition  to  yielding 
excellent  performance  for  many  2-D  estimation  problems,  we  also  believe  that  these  ideas 
will  be  of  value  in  constructing  efficient  preconditioners  for  many  other  problems  such  as 
for  the  solution  of  certain  partial  differential  equations. 


Large-scale  estimation  and  computation  in  remote  sensing  and  space-time  data 
assimilation 

One  of  the  important  characteristics  of  the  multiresolution  methods  that  we  have 
developed  is  that  they  lead  to  algorithms  offering  considerable  computational  advantages 
as  compared  to  other  estimation  formalisms.  This  suggests,  of  course,  that  it  should  be 
possible  to  adapt  and  apply  these  methods  to  exuemely  large-scale  estimation  problems. 
Indeed  the  work  we  have  performed  to  date  demonstrates  this  potential  and  also  points  to 
several  other  important  problems  in  large-scale  estimation  and  computation  that  are 
worthy  of  reseai'ch.  In  particular,  the  work  that  we  have  performed  that  is  relevant  to  this 
portion  of  our  reseai'ch  is  described  in  [11-16,  35-36,  39-41,  53,  58,  64].  The  results  of 
note  ai’e  the  following: 


1.  In  [53,  64]  we  present  the  preliminai-y  results  of  our  first  effort  in  applying  our 
multiscale  estimation  results  to  remote  sensing  problems  of  considerable  size.  In 
particular,  in  this  work  we  describe  the  processing  of  TOPE^^oseidon  satellite  altimetry 
data  over  the  North  Pacific,  a  particulai'ly  large-scale  estimation  problem  for  which  we 
were  able  to  get  a  substantial  amount  of  real  data.  This  example  is  indicative  of  the 
capabilities  of  our  methodology,  as  we  have  now  demonstrated  that  we  can  produce 
estimates  and  associated  error  covariance  and  error  correlation  information  for  extremely 
large  remote  sensing  problems  (in  this  case  we  in  essence  solve  1  million  dimensional 
least  squai-es  estimation  problems  and  chai'acterize  the  error  covariance  for  the  estimated 
field  in  5  seconds  on  a  Sparc  10).  In  addition,  we  have  recently  begun  collaboration  with 
researchers  involved  in  atmospheric  sensing  problems  in  which  we  believe  that  precisely 
the  same  types  of  models  and  algorithms  will  be  applicable  with  similar  results. 

2.  We  have  also  performed  some  resesai'ch  on  the  problem  of  space-time 
estimation  of  random  fields  that  evolve  in  time.  In  particular,  in  [14,  35-36,  39-4()]  we 
consider  dynamic  problems  in  computer  vision,  including  the  problems  of  space-time 
surface  reconsti'uction  and  the  tracking  optical  flow  over  time,  using  a  prior  model  that 
corresponds  to  smoothness  in  both  space  and  in  time  of  the  optical  flow  field.  Roughly 
speaking,  if  we  view  an  entire  optical  flow  field  at  any  point  in  time  as  the  state  of  a 
dynamical  system,  then  for  a  512  x  512  image  what  we  have  is  a  Kalman  filtering 
problem  of  dimension  somewhat  larger  tlian  500,000!!  The  real  problem  here,  of  course, 
is  in  propagating  tlie  eiTor  covariance  information,  since  this  is  both  needed  in  order  to 
fuse  data  effectively  over  time  and  impossible  to  compute  or  store  given  its  enormous 
size.  The  key  observation  used  in  our  work  is  that  the  update  step  of  the  Kalman  filter 
can  be  thought  of  as  using  the  innovations  at  that  time-i.e.,  the  difference  between  the 
measurements  and  our  best  prediction  of  them  based  on  past  data— to  estimate  the  error  in 
the  prediction  of  the  optical  flow  field  based  on  previous  data.  This  is  nothing  more  than 
a  static  random  field  estimation  problem,  where  the  random  field  to  be  estimated  consists 
of  the  one-step  prediction  errors  in  the  optical  flow. 

Solving  this  static  problem,  of  course,  requires  a  specification  of  the  statistical 
sti-ucture  of  the  eiTor  field  and  it  is  precisely  this  statistical  structure  that  is  captured 
explicitly  in  the  enor  covariance  matrix.  What  is  done  in  [14,  35-36,  39-40]  is  to  propose 
the  idea  of  using  an  implicit  and  approximate  specification  for  these  statistics  in  terms  of 
an  MRF  model  for  the  eiTor  field.  With  such  a  model,  then  each  Kalman  filter  update 
coiresponds  to  the  solution  of  an  MRF  estimation  problem,  which,  as  we  have  said,  is 
equivalent  to  solving  an  elliptic  paitial  differential  equation  in  space.  In  addition, 
however,  we  need  a  method  for  propagating  the  error  statistics  over  time— i.e.,  instead  of 
using  the  Riccati  equation  to  propagate  the  error  covai'iance,  we  need  an  alternative 
procedure  that  propagates  the  parameters  of  tlie  MRF  model  for  the  eiTor  field, 
accounting  for  the  effects  of  temporal  dynamics  and  the  assimilation  of  new  data  at  each 
point  in  time.  The  success  in  our  work  is  directly  attributable  to  the  fact  that  we  were 
able  to  derive  a  procedure  for  propagating  such  MRF  models  that  accurately  captures  the 
statistics  of  the  emor  field. 

3.  As  we  have  pointed  out  on  several  occasions,  the  solution  of  MRF  estimation 
problems  couespond  to  the  solution  of  elliptic  partial  differential  equations.  In  addition, 
we  have  also  shown  that  MRF  models  can  be  well-approximated  by  multiresolution 
models,  so  that  estimation  algorithms  that  ai’e  fai-  superior  computationally  can  be  used. 
Putting  these  two  facts  together,  we  ai'e  led  to  the  idea  of  using  our  multiresolution 
methods  for  the  approximate  solution  of  elliptic  pai'tial  differential  equations  or,  in  signal 
processing  contexts,  to  the  approximation  of  2-D  filters  described  by  2-0  difference 
equations.  This  idea  is  the  subject  of  several  parts  of  our  work.  In  particular,  in  [11,  15, 
49,  58]  we  have  shown  that  there  is  a  close  connection  between  our  approach  to 


multiresolution  modeling  and  direct  approaches  to  solving  partial  differential  equations 
such  as  nested  dissection,  the  critical  difference  being  that  we  use  approximate  models  at 
each  stage  of  such  procedures.  For  example,  as  we  have  seen,  rather  than  keeping  all  of 
the  values  of  an  MRF  along  a  sepai'ating  boundary  in  order  to  exactly  decorrelate  the 
values  in  two  subregions,  we  keep  a  coarse  resolution  approximation  to  the  boundary 
values  and  neglect  the  residual  coiTelation.  That  is,  we  develop  an  approximate  and 
comparatively  low-dimensional  description  of  the  linear  constraints  coupling  these  two 
regions  through  the  boundai7.  In  most  of  our  work  we  have  demonstrated  this 
approximation  thi'ough  estimation  problems,  but  in  [58]  we  have  shown  that  acetate 
approximations  can  be  obtained  for  simple  partial  differential  equations  not  explicitly  tied 
to  estimation  problems.  In  addition,  in  [25]  in  which  we  consider  replacing  the 
smoothness  consti'aint  in  optical  flow  estimation  by  a  multiresolution  prior,  we  also 
consider  the  use  of  the  resuldng  output  of  the  multiresolution  estimator  as  an  initial  guess 
for  the  iterative  solution  of  the  vector  Poisson  differential  equation  that  must  be  solved  if 
we  wish  to  solve  the  estimation  problem  using  the  MRF  smoothness-based  prior. 


Multiresolution  and  wavelet-based  methods  for  the  detection  of  abrupt  changes  in 
signals 


In  this  section  we  turn  our  attention  to  the  use  of  wavelets  and  multiresolution 
methods  for  signals  and  systems  that  evolve  in  time  rather  than  in  space.  Once  again  the 
work  that  we  propose  builds  directly  on  our  recent  research  results,  which  are  reported  in 
[8,9,38,48,55,  57,  65,  66].  The  overall  emphasis  of  the  work  in  these  papers  and  reports 
is  on  the  development  of  a  statistically  sound  basis  for  the  use  of  wavelets  in  several 
different  estimation  and  detection  contexts.  Specifically,  our  work  to  date  has  yielded  the 
following: 

1.  We  have  developed,  implemented  and  successfully  tested  a  technique  for  the 
design  of  algorithms  for  the  classification  of  U’ansient  waveforms  based  on  the  use  of 
wavelet  packet  transfonns  [8.9.38]  In  pai  ticular,  the  idea  behind  this  approach  was  to 
take  advantage  of  the  fact  that  wavelet  packets  provide  a  very  flexible  framework  in 
which  to  identify  the  underlying  structure  of  a  signal  in  tenws  of  the  way  in  which  it  is 
best  focused  using  wavelet  packet  bases.  The  wavelet  packet  transform  provides  a 
hierarchy  of  wavefonn  bases,  where  at  the  top  of  the  hierai'chy  the  original  signal 
appears,  thus  providing  maximal  resolution  in  time  and  no  resolution  of  the  frequency 
content  of  the  signal.  As  we  move  down  the  hierai'chy  we  obtain  increasingly  resolution 
in  frequency  and  a  commensurate  decrease  in  resolution  in  time,  culminating  at  the 
bottom  level  with  what  is  in  effect  the  windowed  Fourier  transform  of  the  signal, 
providing  maximal  frequency  resolution  and  no  time  resolution  (over  the  considered 
window  of  data).  The  concept  behind  our  work  was  to  perform  what  we  think  of  as 
partially  coherent,  adaptive  detection,  where  a  completely  coherent  approach  would 
con-espond  to  a  bank  of  matched  filters,  computing  comelations  with  a  fixed  set  of  signals 
(one  per  class  to  be  identified)  followed  by  a  compaitison  to  choose  the  largest  value. 

Such  a  coherent  approach  provides  maximum  focusing  of  information,  but,  of  course, 
requires  a  reasonably  accurate  model  for  each  signal  class.  Indeed  if  such  a  model  is  not 
available  or  if  the  variability  of  the  signals  within  each  class  are  such  that  no  such  simple 
focusing  is  possible,  the  use  of  a  completely  coherent  approach  will  in  general  fail.  Our 
objective  was  to  develop  a  robust  method  for  signal  classification  in  just  such  cases. 


The  appoach  that  we  took  was  to  use  a  d'aining  set  of  data  to  identify  the  level  in 
the  rich  wavelet  packet  libraray  at  which  we  could  effectively  focus  information-i.e.,  to 
identify  those  paits  of  the  wavelet  packet  tableau  in  which  signal  energy  focused.  Since 
calculating  each  wavelet  packet  coefficient  coiresponds  exactly  to  a  coiTclation  or 
matched  filtering  operation,  if  we  were  fortunate  enough  to  find  that  the  energy  always 
focused  in  exactly  the  same  small  sets  of  coefficients,  then  we  could  indeed  use  a 
completely  coherent  matched  filtering  approach.  If  this  were  not  the  case,  then  we  would 
need  to  follow  the  calculation  of  the  wavelet  coefficient  tableau  by  the  incoherent 
summing  of  energies  over  sets  of  coefficients.  In  the  specific  approach  that  we  adopted, 
we  summed  energies  over  entire  sets  of  coefficients:  at  the  top  of  the  tableau  this 
corresponds  simply  to  the  calculation  of  the  overall  energy  in  the  signal,  at  the  next  level 
it  corresponds  to  calculation  of  energies  over  two  orthogonal  signals  corresponding  to 
(crude)  low-pass  and  high-pass  versions  of  the  signal,  etc.,  where  at  the  bottorn  we  are 
simply  calculating  tlie  squai-ed-magnitude  of  the  Fourier  ttansform  at  each  individual 
frequency. 

Using  training  sets  of  data  for  each  signal  class,  we  then  (a)  used  SVD  analysis  to 
identify  the  regions  in  this  energy  tableau  in  which  energy  focused  for  each  signal  class; 
and  (b)  used  the  dominant  singular  vectors  of  energy  disu-jbution  in  each  class  to 
determine  the  critical  differences  in  these  energy  disnibutions  from  one  class  to  the  next, 
thereby  providing  the  basis  for  robust  signal  classification.  This  approach  was  tested  on 
real  sonar  data  for  the  detection  and  classification  of  marine  biological  sounds,  a,nd 
decision  rules  using  a  remarkably  small  number  of  wavelet  packet  energy  statistics 
resulted  in  excellent  perfoimance  when  tested  on  other  data  sets. 

2.  We  have  also  recently  completed  a  project  [55,  65]  on  the  use  of  the 
continuous  wavelet  transfomi  (CWT— i.e.,  the  set  of  multiscale  signals  obtained  by 
convolving  the  original  signal  with  scaled  versions  of  the  wavelet  but  without  the 
subsampling  that  is  done  to  form  the  orthonomial  wavelet  transform)  for  the  detection  of 
abui-pt  changes  in  signals,  in  this  case  based  on  the  use  of  tlie  CWT  extrema  processing 
methods  proposed  by  Mallat.  The  motivation  for  Mallat  s  methodology  stems  from  two 
facts:  (a)  although  not  tme  in  certain  pathological  cases,  a  signal  can  usually  be 
completely  characterized  and  reconstructed  from  the  knowledge  of  the  locations  and 
values  of  the  extrema  of  its  CWT  at  all  dyadic  scales;  and  (b)  the  set  of  extrema  across 
scale  that  coiTespond  to  a  particular  point  in  the  original  signal  characterize  the  nature  of 
the  singularity  of  the  signal  at  that  point.  In  particulai',  the  magnitudes  of  these  extrema 
vary  geometrically  across  scale,  with  an  exponent  determined  by  the  Lipschitz  exponent 
of  the  original  signal  at  that  point.  For  example,  discontinuites  have  Lipschitz  exponent 
of  0  (so  that  the  exU’ema  should  in  principle  have  constant  value  across  scale),  and  "noise¬ 
like"  or  impulsive  behavior  will  have  Lipschitz  exponent  of  -1,  coiyesponding  to  decaying 
extrema  magnitudes  as  we  move  to  coarser  scales  (at  which  such  high  frequency  effects 
are  blurred  out).  Thus,  as  Mallat  argued,  if  one  can  "chain"  the  extrema  across  scale-i.e. 
identify  which  extremum  at  one  scale  goes  with  which  at  the  next,  and  so  on-then  we  can 
estimate  Lipschitz  exponents,  discard  those  with  negative  values  as  being  noise,  and 
reconstruct  a  "denoised"  version  of  the  signal. 

In  our  work  we  have  taken  a  rather  different  look  at  the  use  of  these  extrema.  One 
of  our  objectives  has  been  to  begin  to  provide  a  statistical  justification  for  tliis  type  of 
method.  Our  conjecture  has  been  that  these  methods,  while  certainly  not  optimal  for  any 
obvious  specific  choice  of  models  for  signal  and  noise  (e.g.,  such  as  detemiminstic 
signals  in  Gaussian  white  noise),  may  very  well  be  extremely  robust  to  the  detailed 
behavior  of  tlie  statistics  of  the  signal  and  noise.  The  context  in  which  we  have  chosen  to 
investigate  this  is  the  detection  of  abrupt  changes  in  signals,  in  particular,  the  robust 
identification  of  knots  in  spline  functions  obseiwed  in  noise.  This  is  a  prototypical  abrupt 
change  detection  problem  of  interest  in  a  variety  of  applications  including  failure 
detection.  If  we  know  that  the  measurement  noise  is  Gaussian,  then  the  generalized 


likelihood  ratio  method  provides  the  optimal  algorithm  for  detecting  and  locating  the 
spline  knots.  However,  it  is  well  known  that  such  a  method  can  degrade  considerably  if 
the  real  noise  has  "heavy  tails".  The  appoach  that  we  have  taken,  then,  is  to  look  at  CWT 
extrema  as  the  basis  for  detecting  and  locating  these  knots. 

Moreover,  we  have  now  shown  that  there  is  a  very  novel  way  in  which  the 
chaining  of  exti'ema  across  scale  and  the  location  of  the  knots  can  be  folded  into  an 
optimal  estimation  framework.  In  particular,  depending  upon  the  specific  wavelet  chosen 
and  the  order  of  the  spline  (i.e.,  the  order  of  the  derivative  that  is  discontinuous  at  the 
knot  location),  the  noise-free  locations  of  the  knots  across  scale  is  a  completely 
deterministic  trajectoi7.  This,  together  with  the  fact  that  the  same  is  true  for  the 
magnitudes  of  these  extrema  suggest  the  adaptation  of  techniques  for  multitarget  traking 
and  data  association  to  the  chaining  of  extrema  across  scale  and  the  detection  and 
localization  of  knots  across  scale.  For  example,  if  we  focus  on  linear  splines  (with 
discontinuities  in  first  derivatives)  and  use  a  symmetric  wavelet  (e.g.,  the  second 
derivative  of  a  Gaussian),  then  the  extrema  corresponding  to  a  knot  should  occur  at 
exactly  the  same  location  at  each  scale  (i.e.,  a  sti'aight  line  trajectory  across  scale)  and 
should  double  in  size  at  each  successively  coarser  scale).  Thus  for  each  hypothesized  knot 
we  have  a  2-D  state  model  across  scale  and  then  can  use  optimal  multiple  hypothesis 
testing  methods  in  order  to  chain  the  extrema,  discai'd  extraneous  ones  or  ones 
cori'esponding  to  non-spline-like  behavior,  and  identify  the  knot  locations  and  rnagnitudes 
of  derivative  discontinuities.  Our  experimental  results  show  that  this  technique  is 
exceedingly  robust,  both  in  the  presence  of  Gaussian  noise  and  in  the  presence  of  heavy¬ 
tailed  noises. 


Data  fusion  and  inversion 

We  next  describe  research  directions  that  involve  the  processing  of  measurement 
data  that  is  characterized  by  the  fact  that  (a)  they  come  from  several  different 
measurement  sources  with  different  resolutions;  and/or  (b)  the  measurement  mechnisms 
used  involve  "probing"  of  the  phenomenon  to  be  imaged-i.e.,  they  provide  non-local 
measurements  of  the  medium  fo  be  imaged.  Problems  of  this  type  are  the  rule  rather  than 
the  exception  in  many  signal  processing,  estimation,  and  detection  applications  including 
nondestmctive  evaluation  and  radar  imaging  (such  as  synthetic  aperture  radar  processing) 
and  other  remote  sensing  problems  (in  which  the  probing  energy  is  applied  at  a  distance 
and  therefore  yields  nonlocal  data). 

Our  work  in  this  ai'ea  is  described  in  a  number  of  papers  and  reports  [33,  37,  45- 
48,  50-52,  59-61,  63].  The  basic  idea  behind  this  work,  as  it  was  for  the  work  described 
in  the  preceding  section,  is  to  develop  multiscale  methods  for  data  fusion  and  inversion. 

In  this  case,  however,  our  work  has  focused  directly  on  the  use  of  wavelet  transform  ideas 
rather  than  on  the  use  of  pyramidal  scale-recursive  models  defined  on  trees. 

Inverse  and  multiresolution  fusion  problems  present  a  number  of  important 
challenges,  and  our  work  in  this  ai'ea  has  been  to  investigate  the  use  of  multiresolution 
methods  to  provide  a  framework  to  deal  with  these.  In  the  first  place,  by  their  very 
nature,  inverse  and  multiresolution  fusion  problems  ai'e  confronted  with  a  mismatch 


between  the  domain  of  the  data-which  is  nonlocal  or  multiresolution  and  the  desired 
domain  of  the  reconstmction.  This  mismatch  manifests  itself  in  serious  computational 
and  conceptual  problems  if  attacked  by  brute  force  methods.  For  example,  if  we  try  to 
regularize  such  problems  by  using  a  smootliness  or  related  prior  MRF  model  directly  in 
the  domain  in  which  we  wish  to  perfoirn  reconstruction,  the  resulting  variational  problem 
is  extraordinarily  complex— in  general  an  elliptic  partial  differential-integral  equation 
(where  the  differential  part  comes  from  the  MRF  model  and  the  integral  part  from  the 
nonlocal  nature  of  the  measurements).  Not  only  is  this  complex  computationally,  but  it 
also  makes  the  computation  of  eiror  statistics  prohibitively  complex  and,  further, 
obscures  questions  such  as  determining  the  relative  value  of  data  from  different  sources, 
identifying  the  best  resolution  for  reconstruction  (which  may  be  space-varying  if  we  have 
irregularly  and  differently  sampled  data  sources),  etc. 

Our  work  has  had  issues  such  as  these  in  mind  as  we  have  explored  the  use  of 
wavelets  for  these  problems,.  What  wavelets  in  essence  provide  is  a  mechanism  for 
translating  both  our  data  and  in  some  cases  our  reconstruction  domain  as  well  to  a 
common  setting  in  which  descriptions  of  both  measurements  and  our  prior  knowledge 
are  local.  In  this  way  we  then  can  addi'ess  many  important  questions  such  as  those 
mentioned  above  with  comparative  ease  and  with  tools  that  greatly  facilitate  our  insight. 

In  pai'ticular,  in  our  work  to  date  we  have  accomplished  the  following: 

1.  In  pait  of  our  work  [33,  37, 46,  52,  60,  61]  we  have  examined  several  specific 
multisensor  inverse  problems,  namely  1-D  deconvolution  problems  and  2-D  linearized 
electromagnetic  inverse  scattering  problems,  described  by  linear  integi’al  equations.  In 
this  work  we  have  developed  wavelet-based  data  fusion  and  inversion  algorithms  that  (i) 
are  efficient  in  that  diey  take  advantage  of  the  properties  of  wavelets  to  compress  integral 
operators  as  in;  (ii)  they  allow  us  to  use  fractal  or  other  multiresolution  prior  models 
described  directly  in  the  wavelet  coefficient  domain  (as  described,  for  example,  in  [17, 
22,  28]);  to  regularize  the  inversion  and  fusion  operations  without  increasing 
computational  complexity;  (iii)  they  allow  us  to  calculate  eiTor  statistics  and  thus  to 
identify,  at  every  point  in  space,  the  relative  importance  of  each  measurement  source  (a 
useful  tool  for  measurement  placement  and  design)  and  the  optimal  scale  for 
reconsti'uction.  In  particular  by  applying  what  may  be  different  wavelet  transforms  to 
each  data  source  and  to  the  reconsti’uction  domain  itself,  we  identify  a  framework  in 
which  the  prior  statistics  and  measurements  ai'e  both  local.  Of  course  for  deconvolution 
problems  with  complete  data-i.e.,  with  densely  sampled  and  complete-coverage 
measurements— we  could  simply  apply  FFT's  to  accomplish  the  sarne  thing  as  long  as  the 
prior  model  that  is  used  is  that  of  a  stationai-y  random  field.  What  is  paiticularly 
interesting  about  the  approach  we  have  developed  is  that  (a)  it  allows  nonstationarity  of 
the  prior  model  (of  critical  importance  in  many  nondestructive  evaluation  problems  in 
which  we  expect  differing  levels  of  fine  scale  behavior  in  different  regions);  (b)  it  allows 
incomplete  and/or  sparse  data  which  may  differ  from  sensor  to  sensor^;  (c)  it  deals  with 


^This  is  another  innovation  of  our  approach  as  compared  to  other  recent  attempts  to  use  wavelets  for 
inverse  problems  (e.g.,  [W])-i.e.,  what  is  particular  to  our  approach  is  not  only  the  use  of  the  methods  of 
optimal  estimation  to  provide  error  statistics  and  detailed  analyses  based  on  these  (as  in  Fig.  5)  but  also  the 


nonconvolutional  kernels  as  well;  and  (d)  it  produces  the  detailed  multii-esolution 
statistical  information  mentioned  previously. 

2.  In  the  other  portion  of  our  work  [45-48,  51,  59,  63]  we  have  focused  on 
problems  of  tomographic  reconstruction,  i.e.,  on  problems  in  which  the  measurement  data 
are  line  integrals  through  the  object  to  be  imaged.  These  measurements  are  usually 
organized  as  "projections",  i.e.,  as  sets  of  parallel  line  integrals,  which  can  then  be  viewed 
as  samples  of  the  1-D  "projection”  of  the  function  onto  an  axis  perpendicular  to  the  lines 
of  integration.  Regularization  of  tomographic  reconstruction  problems,  either  when  the 
measurement  data  are  noisy  or  sparse  is  a  notoriously  difficult  problem,  due  to  the  severe 
nonlocality  of  the  measurements  and  thus  to  the  mismatch  between  measurement  and 
object  domain.  What  we  have  done  in  our  work  is  to  overcome  this  problem  by 
transforming  both  the  data  and  the  object  to  a  common  intermediate  domain  in  which  we 
can  then  develop  wavelet  based  methods.  In  particular,  as  in,  we  note  that  the  exact 
inversion  from  complete  tomographic  measurement  data  to  object  dornain,  that  is  the 
inverse  Radon  transfoiTn,  can  be  viewed  as  the  composition  of  a  1-D  filtering  operation, 
namely  the  so-called"ramp"  filter,  on  each  projection,  followed  by  a  so-called  back- 
projection,  which  is  nothing  more  than  the  representation  of  the  object  in  a  nonstand^d 
basis,  termed  the  "natural  pixel"  basis  in.  What  we  have  done  in  essence  is  to  use  this 
basis  as  a  common  stai'ting  point  and  then  have  further  applied  1-D  wavelet  transforms  to 
both  the  projection  data  and  to  the  representation  of  the  object  in  the  natural  pixel  basis. 
What  this  does  is  (a)  provide  a  model  for  our  object  in  what  we  have  termed  the  "natural 
wavelet  basis";  and  (b)  essendally  diagonalizes  the  ramp  filtering  operation-i.e.,  the  map 
from  wavelet-ti-ansfonned  data  to  natural  wavelet  coefficients  is  essentially  diagonal.  An 
important  point  here  is  that  this  approach  is  quite  different  from  other  approaches  to  the 
use  of  wavelets  in  tomography,  since  our  approach  uses  1-D  wavelet  transforms  for  both 
the  measurement  data  and  for  the  object  representation  (thanks  to  our  use  of  natural 
pixels).  This  has  profound  consequences  both  in  terms  of  the  simplification  it  brings  to 
the  tomographic  inversion  problem  (which  again  greatly  facilitates  estimation,  error 
analysis,  identifying  scales  for  reconstruction,  etc.)  and  in  terms  of  our  ability  to  deal  with 
situations  in  which  projections  ai’e  available  over  a  limited  range  of  angles  or  are  sparsely 
or  in-egularly  sampled  in  angle). 


fact  that  we  have  been  able  to  accommodate  sparse  and  incompletely  sampled  data  by  tailoring  possiblty 
different  wavelet  uansfonns  to  each  data  source. 
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